feat(#79): Code application pipeline — parse, write to workspace, load into sandbox#82
Conversation
…t output New module: src/tools/code_parser.py - Parses # --- FILE: path --- markers into ParsedFile objects - Path normalization with security checks (traversal, absolute paths) - Workspace containment validation (defense-in-depth) - Pure functions, zero side effects, fully testable
Tests cover: - Path normalization (10 tests): simple, dot-slash, backslash, traversal, absolute - Multi-file parsing (16 tests): markers, no-markers fallback, empty files, duplicates, nested paths, mixed types, large output - Workspace validation (3 tests): containment, symlinks, empty input
Tests cover: - Happy path: files parsed and stored in state - Workspace writes: files written to disk with nested dirs - Graceful skip: no code, no blueprint - Security: path traversal skipped - Trace entries added for dashboard visibility - Status unchanged (pass-through node)
Orchestrator changes: - New import: code_parser (parse_generated_code, validate_paths_for_workspace) - New helper: _get_workspace_root() reads WORKSPACE_ROOT env var - New state field: parsed_files: list[dict] in GraphState and AgentState - New node: apply_code_node -- parses generated_code, validates paths, writes files to workspace, stores parsed_files in state - Updated graph: developer -> apply_code -> sandbox_validate - Updated _run_sandbox_validation: accepts parsed_files, passes as project_files to E2B runner (files loaded before validation commands) - Updated sandbox_validate_node: reads parsed_files from state, logs file loading step - Updated initial_state: includes parsed_files: [] Graph flow now: START -> architect -> developer -> apply_code -> sandbox_validate -> qa -> (conditional)
…te nodes Runner changes: - New INFRA_NODES set for infrastructure nodes (apply_code, sandbox_validate, flush_memory) - New _handle_infra_node method: handles non-agent nodes with SSE events - apply_code: emits timeline entry with file count/chars, logs each file path - sandbox_validate: emits timeline entry with test results, logs pass/fail - Updated initial_state: includes parsed_files: [] - Updated _handle_node_completion: routes infra nodes to new handler
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughAdds a code-application stage: parse Dev agent output into files, validate and write them into a local workspace, load those files into the sandbox for validation, and emit task-progress and timeline events for new infra nodes (apply_code, sandbox_validate, flush_memory). Changes
Sequence Diagram(s)sequenceDiagram
participant DevAgent as Dev Agent
participant Orchestrator as Orchestrator
participant CodeParser as Code Parser
participant Workspace as Workspace
participant E2BSandbox as E2B Sandbox
participant Validator as Validator
DevAgent->>Orchestrator: submit generated_code (text with markers)
Orchestrator->>CodeParser: parse_generated_code(generated_code)
CodeParser-->>Orchestrator: list[ParsedFile]
Orchestrator->>Orchestrator: validate_paths_for_workspace(parsed_files)
Orchestrator->>Workspace: write validated files to workspace
Workspace-->>Orchestrator: acknowledge files written
Orchestrator->>E2BSandbox: load parsed_files into sandbox
E2BSandbox-->>Orchestrator: acknowledge files loaded
Orchestrator->>Validator: trigger sandbox_validate(parsed_files)
Validator->>E2BSandbox: run validation commands
E2BSandbox-->>Validator: test results
Validator-->>Orchestrator: validation complete (emit timeline/progress)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
…andbox flow The existing test_graph_edge_developer_to_sandbox asserted developer connects directly to sandbox_validate, but the new graph inserts apply_code between them. Updated test: - Renamed to test_graph_edge_developer_to_apply_code - Added test_graph_edge_apply_code_to_sandbox - Added test_graph_has_apply_code_node - Full chain: developer -> apply_code -> sandbox_validate -> qa verified
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (3)
dev-suite/src/tools/code_parser.py (3)
61-65: Consider adding explicit rejection of Windows-style absolute paths.The current check only catches Unix absolute paths (
/etc/passwd). Windows paths likeC:\secret.txtbecomeC:/secret.txtafter normalization and pass this check. Whilevalidate_paths_for_workspacewould catch this downstream, rejecting it here provides clearer error messages and earlier failure.🔧 Optional enhancement
# Reject absolute paths if cleaned.startswith("/"): raise CodeParserError( f"Absolute path not allowed: '{raw_path}'" ) + + # Reject Windows-style absolute paths (e.g., C:/ or D:\) + if len(cleaned) >= 2 and cleaned[1] == ":" and cleaned[0].isalpha(): + raise CodeParserError( + f"Absolute path not allowed: '{raw_path}'" + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dev-suite/src/tools/code_parser.py` around lines 61 - 65, The current rejection only checks Unix-style absolute paths via cleaned.startswith("/") so add an explicit Windows absolute-path check (drive letter followed by :/ or :\) and raise CodeParserError with a clear message using raw_path; e.g., detect with a regex like r'^[A-Za-z]:[\\/]' against cleaned (or raw_path) and raise the same CodeParserError to fail early (this complements validate_paths_for_workspace). Update the block that references cleaned and raw_path to include this additional condition and error.
186-193: Move logging import to module level.The
import logginginside the function is unconventional. Since logging is a standard library module with no circular import risk, it should be imported at the top of the file alongside other imports.♻️ Move import to module level
At the top of the file (around line 14):
import loggingThen remove lines 187-188 from inside the function.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dev-suite/src/tools/code_parser.py` around lines 186 - 193, The inline "import logging" inside the function should be moved to the module level; add "import logging" with the other imports at top of the file and remove the in-function import lines so the existing logging.getLogger(__name__).warning(...) call (which logs "Skipping file with path escaping workspace: %s -> %s" and references pf.path and target) continues to work without a local import.
145-156: Minor inefficiency: paths are normalized twice.Each path is normalized once at line 123 (stored in
seen) and again at lines 150 when buildingordered_paths. Since the normalized paths are already stored as keys inseen, you could simplify by tracking insertion order directly.♻️ Optional simplification using ordered dict semantics
Since Python 3.7+ dicts maintain insertion order, you could track first-seen order directly:
# Extract files between markers - seen: dict[str, ParsedFile] = {} + seen: dict[str, ParsedFile] = {} # Maintains insertion order (Python 3.7+) + first_seen_order: list[str] = [] for i, match in enumerate(markers): raw_path = match.group(1) path = _normalize_path(raw_path) + if path not in seen: + first_seen_order.append(path) + # Content starts after this marker's line # ... # Last occurrence wins for duplicate paths seen[path] = ParsedFile(path=path, content=content) - # Return in order of first appearance, but with last-wins content - ordered_paths: list[str] = [] - for match in markers: - raw_path = match.group(1) - try: - path = _normalize_path(raw_path) - except CodeParserError: - continue - if path not in ordered_paths: - ordered_paths.append(path) - - return [seen[p] for p in ordered_paths if p in seen] + return [seen[p] for p in first_seen_order]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dev-suite/src/tools/code_parser.py` around lines 145 - 156, The code currently normalizes paths twice: once when populating seen and again when building ordered_paths; instead, track first-seen order when you populate seen so you can return in first-appearance order without re-normalizing. Update the code that assigns into seen (the block that calls _normalize_path and sets seen[path]=...) to also append path to an ordered_paths list when the key is first added, remove the second normalization loop that iterates markers and calls _normalize_path again, and change the final return to use the ordered_paths list and seen (i.e., return [seen[p] for p in ordered_paths if p in seen]); reference variables/functions: seen, markers, _normalize_path, ordered_paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dev-suite/src/api/runner.py`:
- Around line 274-276: The log message shows "None passed, None failed" when
passed/failed are None; update the self._emit_log call (in the same block that
sets the action variable) to use the same guarded formatting as action—e.g.,
replace the direct {passed} and {failed} with a fallback like {passed or '?'}
and {failed or '?'} (or explicit None checks) so the log reads "[sandbox:locked]
Validation: X passed, Y failed" even when values are None.
In `@dev-suite/src/tools/code_parser.py`:
- Around line 183-195: The current containment check using
str(target).startswith(str(resolved_root)) can produce false-positives; replace
it with a proper Path containment test by attempting
target.relative_to(resolved_root) inside a try/except (catch ValueError) to
detect escaping paths, log and continue on failure, and only append pf to
safe_files when relative_to succeeds; update the block that computes target from
(resolved_root / pf.path).resolve() and the logging call that references pf.path
and target to use this new containment test.
In `@dev-suite/tests/test_apply_code.py`:
- Around line 107-122: The test test_path_traversal_skipped currently only
supplies "src/safe.py" so it doesn't exercise path traversal; update the
dangerous_code in the test to include at least one file header that uses a
traversal/absolute path (e.g., "# --- FILE: ../../outside.py ---" or "# ---
FILE: /etc/passwd ---") alongside the safe file, then call
apply_code_node(state) as before and assert that the traversal/absolute entry is
omitted from result["parsed_files"] while "src/safe.py" remains present; keep
the patching of _get_workspace_root and the existing assertions for parsed_files
but add an assertion that no parsed file has a path that resolves outside the
workspace (or that the length equals 1 and the only path is "src/safe.py").
In `@dev-suite/tests/test_code_parser.py`:
- Around line 249-266: The test test_symlink_escape_filtered should assert that
the symlink that resolves outside the workspace is filtered out by
validate_paths_for_workspace; replace the weak assertion with a strict check
that only the safe file remains (e.g., assert len(result) == 1 and that the
remaining ParsedFile.path is "src/main.py") so the test fails if the sneaky
ParsedFile is not removed. Ensure you reference the ParsedFile entries returned
by validate_paths_for_workspace when asserting the remaining path.
---
Nitpick comments:
In `@dev-suite/src/tools/code_parser.py`:
- Around line 61-65: The current rejection only checks Unix-style absolute paths
via cleaned.startswith("/") so add an explicit Windows absolute-path check
(drive letter followed by :/ or :\) and raise CodeParserError with a clear
message using raw_path; e.g., detect with a regex like r'^[A-Za-z]:[\\/]'
against cleaned (or raw_path) and raise the same CodeParserError to fail early
(this complements validate_paths_for_workspace). Update the block that
references cleaned and raw_path to include this additional condition and error.
- Around line 186-193: The inline "import logging" inside the function should be
moved to the module level; add "import logging" with the other imports at top of
the file and remove the in-function import lines so the existing
logging.getLogger(__name__).warning(...) call (which logs "Skipping file with
path escaping workspace: %s -> %s" and references pf.path and target) continues
to work without a local import.
- Around line 145-156: The code currently normalizes paths twice: once when
populating seen and again when building ordered_paths; instead, track first-seen
order when you populate seen so you can return in first-appearance order without
re-normalizing. Update the code that assigns into seen (the block that calls
_normalize_path and sets seen[path]=...) to also append path to an ordered_paths
list when the key is first added, remove the second normalization loop that
iterates markers and calls _normalize_path again, and change the final return to
use the ordered_paths list and seen (i.e., return [seen[p] for p in
ordered_paths if p in seen]); reference variables/functions: seen, markers,
_normalize_path, ordered_paths.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d1842ff3-192e-4b3c-8440-eecd9fa1b513
📒 Files selected for processing (5)
dev-suite/src/api/runner.pydev-suite/src/orchestrator.pydev-suite/src/tools/code_parser.pydev-suite/tests/test_apply_code.pydev-suite/tests/test_code_parser.py
Fixes: - [Major] Use relative_to() instead of startswith() for path containment (prevents sibling-directory false positive: /workspace vs /workspace2) - [Nitpick] Add Windows absolute path rejection (C:/ D:\) - [Nitpick] Move logging import to module level - [Nitpick] Eliminate double normalization — track first_seen_order during initial parse loop instead of re-iterating markers
- [Major] test_path_traversal_skipped now includes actual ../../../ path (parser raises CodeParserError, node catches it gracefully) - Added test_workspace_containment_filters_unsafe for the second defense layer
- [Minor] Tighten symlink test assertion: == 1 with path check, not >= 1 - Added test_rejects_windows_absolute_path (C:\) - Added test_rejects_windows_absolute_path_forward_slash (D:/)
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dev-suite/tests/test_sandbox_validate.py`:
- Around line 209-221: The test test_graph_edge_developer_to_apply_code
currently only asserts "apply_code" is present but doesn't prevent a regression
adding a direct developer -> sandbox_validate edge; update the test to both
require "apply_code" and forbid "sandbox_validate" by inspecting the built graph
(use build_graph(), compiled.get_graph(), iterate graph_data.edges to form
developer_targets from e.target where e.source == "developer") and assert that
"apply_code" is in developer_targets and "sandbox_validate" is not in
developer_targets (or assert developer_targets == {"apply_code"} if the intent
is exclusivity).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: deb0f9d1-ae7d-4666-af2e-61c7b50b3053
📒 Files selected for processing (1)
dev-suite/tests/test_sandbox_validate.py
| def test_graph_edge_developer_to_apply_code(self): | ||
| """developer should connect to apply_code (not directly to sandbox).""" | ||
| from src.orchestrator import build_graph | ||
|
|
||
| graph = build_graph() | ||
| compiled = graph.compile() | ||
| graph_data = compiled.get_graph() | ||
| # Check edges: developer -> sandbox_validate | ||
| developer_targets = { | ||
| e.target for e in graph_data.edges | ||
| if e.source == "developer" | ||
| } | ||
| assert "sandbox_validate" in developer_targets | ||
| assert "apply_code" in developer_targets | ||
|
|
There was a problem hiding this comment.
Strengthen edge assertion to match test intent.
Line 220 only checks that apply_code is present. It does not enforce the “not directly to sandbox” part, so a regression adding developer -> sandbox_validate would still pass.
✅ Suggested assertion hardening
developer_targets = {
e.target for e in graph_data.edges
if e.source == "developer"
}
assert "apply_code" in developer_targets
+ assert "sandbox_validate" not in developer_targets📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def test_graph_edge_developer_to_apply_code(self): | |
| """developer should connect to apply_code (not directly to sandbox).""" | |
| from src.orchestrator import build_graph | |
| graph = build_graph() | |
| compiled = graph.compile() | |
| graph_data = compiled.get_graph() | |
| # Check edges: developer -> sandbox_validate | |
| developer_targets = { | |
| e.target for e in graph_data.edges | |
| if e.source == "developer" | |
| } | |
| assert "sandbox_validate" in developer_targets | |
| assert "apply_code" in developer_targets | |
| def test_graph_edge_developer_to_apply_code(self): | |
| """developer should connect to apply_code (not directly to sandbox).""" | |
| from src.orchestrator import build_graph | |
| graph = build_graph() | |
| compiled = graph.compile() | |
| graph_data = compiled.get_graph() | |
| developer_targets = { | |
| e.target for e in graph_data.edges | |
| if e.source == "developer" | |
| } | |
| assert "apply_code" in developer_targets | |
| assert "sandbox_validate" not in developer_targets |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@dev-suite/tests/test_sandbox_validate.py` around lines 209 - 221, The test
test_graph_edge_developer_to_apply_code currently only asserts "apply_code" is
present but doesn't prevent a regression adding a direct developer ->
sandbox_validate edge; update the test to both require "apply_code" and forbid
"sandbox_validate" by inspecting the built graph (use build_graph(),
compiled.get_graph(), iterate graph_data.edges to form developer_targets from
e.target where e.source == "developer") and assert that "apply_code" is in
developer_targets and "sandbox_validate" is not in developer_targets (or assert
developer_targets == {"apply_code"} if the intent is exclusivity).
Guard sandbox validation log against None values for passed/failed counts using `or '?'` fallback, matching the action string pattern.
Summary
Closes #79
Bridges the gap between the Dev agent's text output and actual file changes. Parsed code now reaches the filesystem and the E2B sandbox before validation commands run.
What Changed
New: Code Parser (
src/tools/code_parser.py)# --- FILE: path/to/file ---markers from Dev agent output intoParsedFileobjects../), absolute paths (/etc/passwd)New:
apply_code_node(orchestrator)developerandsandbox_validategenerated_code→ validates paths → writes files to workspace → storesparsed_filesin state_get_workspace_root()(readsWORKSPACE_ROOTenv var, defaults to cwd)Updated: Graph Flow
Updated: Sandbox File Loading
_run_sandbox_validation()now acceptsparsed_filesparameterproject_filesdict forE2BRunner.run_tests()Updated: TaskRunner SSE Integration
INFRA_NODESset for infrastructure nodes (apply_code, sandbox_validate)_handle_infra_node()method with SSE events for dashboard visibilityapply_code: emits timeline entry with file count + chars, logs each file pathsandbox_validate: emits timeline entry with test resultsinitial_stateincludesparsed_files: []New State Field
Test Results
37 new tests, all passing:
test_code_parser.py: 29 tests (path normalization, multi-file parsing, edge cases, security)test_apply_code.py: 8 tests (happy path, workspace writes, graceful skips, security, trace entries)Files Changed
src/tools/code_parser.pytests/test_code_parser.pytests/test_apply_code.pysrc/orchestrator.pysrc/api/runner.pyAcceptance Criteria
# --- FILE:formatted outputapply_code_nodewrites parsed files to workspacesandbox_validate_nodeloads parsed files into E2B sandbox before running commands_handle_node_completionSummary by CodeRabbit
New Features
Improvements
Tests